Evaluating Human-Machine Conversation for Appropriateness

Authors

Nick Webb

David Benyon

Preben Hansen

Oli Mival

Abstract

Evaluation of complex, collaborative dialogue systems is a difficult task. Traditionally, developers have relied upon subjective feedback from the user, and parametrisation over observable metrics. However, both models place some reliance on the notion of a task; that is, the system is helping the user to achieve some clearly defined goal, such as booking a flight or completing a banking transaction. It is not clear that such metrics are as useful when dealing with a system that has a more complex task, or even no definable task at all, beyond maintaining and performing a collaborative dialogue. Working within the EU-funded COMPANIONS program, we investigate the use of appropriateness as a measure of conversation quality, the hypothesis being that good companions need to be good conversational partners. We report initial work in the direction of annotating dialogue for indicators of good conversation, including the annotation and comparison of the output of two generations of the same dialogue system.

Similar resources

Institute of Informatics Logics and Security Studies Evaluating Human-Machine Conversation for Appropriateness

Evaluating Human-Computer Conversation in Companions

We report on the first evaluation of the Companions project prototypes. We give preliminary results from our phase one evaluation, using known and well-understood dialogue metrics. We also give a first indication of the directions we plan to take to evaluate increasingly sophisticated conversational systems, using measures of coherence and appropriateness.

The Correlation of Machine Translation Evaluation Metrics with Human Judgement on Persian Language

Machine Translation Evaluation Metrics (MTEMs) are the central core of Machine Translation (MT) engines as they are developed based on frequent evaluation. Although MTEMs are widespread today, their validity and quality for many languages is still under question. The aim of this research study was to examine the validity and assess the quality of MTEMs from Lexical Similarity set on machine tra...

The Appropriateness of Educational Programs' Objectives for Professional Needs: The Viewpoints of Khorramabad School of Nursing and Midwifery Graduates

Introduction: Evaluating the educational programs from the viewpoints of graduates may identify the weaknesses of such programs and provide the opportunity for their improvement. This study was performed to determine the appropriateness of educational programs for professional needs from the viewpoints of graduates of Khorramabad School of Nursing and Midwifery. Methods: This descriptive cros...

A New Statistical Model for Evaluation Interactive Question Answering Systems Using Regression

The development of computer systems and extensive use of information technology in the everyday life of people have just made it more and more important for them to make quick access to information that has received great importance. Increasing the volume of information makes it difficult to manage or control. Thus, some instruments need to be provided to use this information. The QA system is ...

Publication date: 2010
